444 research outputs found

GROUP-LASSO ESTIMATION IN HIGH-DIMENSIONAL FACTOR MODELS WITH STRUCTURAL BREAKS

Author: Song Yujie
Publication venue: Scholarship at UWindsor
Publication date: 17/10/2018

In this major paper, we study the influence of structural breaks in the financial market model with high-dimensional data. We present a model that is capable of detecting changes in factor loadings, determining the number of factors, and detecting the break date. We consider both the case where the break date is known and the case where it is unknown, and we identify the type of instability. For the unknown break date case, we propose a group-LASSO estimator to determine the number of pre- and post-break factors, the break date, and the existence of instability in the factor loadings when the number of factors is constant. We also present the asymptotic properties of the penalized least-squares estimator as both the cross-sectional and time dimensions tend to infinity. Further, we develop a cross-validation procedure to select the tuning parameters of the penalty terms, and we use the least-squares approach to estimate the break date after the number of factors is obtained. We also present a Monte Carlo simulation to evaluate the performance of the proposed procedure and analyze real data from the 2007-09 Great Recession. The proposed procedure generally detects the break date correctly during the Great Recession, but it performs relatively poorly in estimating the number of pre- and post-break factors.
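The least-squares break date step mentioned in the abstract can be illustrated with a deliberately simplified sketch. It assumes a single series with one factor that is treated as observed (the paper works with latent factors in a high-dimensional panel), and it picks the split point that minimizes the combined sum of squared residuals of separate pre- and post-break fits. The function names `ols_ssr` and `estimate_break_date` and the toy data are hypothetical and not taken from the paper.

```python
import math

def ols_ssr(f, x):
    """Sum of squared residuals from a no-intercept OLS fit of x on f."""
    sff = sum(a * a for a in f)
    slope = sum(a * b for a, b in zip(f, x)) / sff if sff > 0 else 0.0
    return sum((b - slope * a) ** 2 for a, b in zip(f, x))

def estimate_break_date(f, x, min_seg=2):
    """Return the split index k minimizing SSR(pre-break) + SSR(post-break).

    ASSUMPTION: one break, one observed factor, one series. This is an
    illustration of the least-squares idea, not the paper's estimator.
    """
    best_k, best_ssr = None, math.inf
    for k in range(min_seg, len(x) - min_seg + 1):
        ssr = ols_ssr(f[:k], x[:k]) + ols_ssr(f[k:], x[k:])
        if ssr < best_ssr:
            best_k, best_ssr = k, ssr
    return best_k

# Toy data: the loading is 1.0 for the first 6 periods and 3.0 afterwards.
factor = [1, 2, -1, 3, 2, -2, 1, 3, -1, 2]
loadings = [1.0] * 6 + [3.0] * 4
series = [l * f for l, f in zip(loadings, factor)]
break_date = estimate_break_date(factor, series)
```

In this noise-free toy example the split at index 6 gives a perfect fit on both segments, so it is the one selected.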

Scholarship at UWindsor

Solving the boolean satisfiability problem using multilevel techniques

Author: Salih Sirar
Song Yujie
Publication venue: Universitetet i Agder / University of Agder
Publication date: 01/01/2011

There are many complex problems in computer science that occur in knowledge representation (artificial thinking), artificial learning, Very Large Scale Integration (VLSI) design, security protocols, and other areas. These complex problems may be reduced to satisfiability problems to which the Boolean Satisfiability Problem (SAT) may be applied. This reduction is made in order to simplify complex problems into a specific propositional logic problem. The SAT problem is the most well-known nondeterministic polynomial time (NP) complete problem in computer science. It is a Boolean expression composed of a number of variables (literals), clauses that contain disjunctions of the literals, and conjunctions of the clauses. The literals take the logical values TRUE and FALSE; the task is to find a truth assignment that makes the entire expression TRUE. The main goal of the thesis is to solve the SAT problem using a clustering technique, Multilevel, combined first with Tabu Search and thereafter with finite Learning Automata. Tabu Search and finite Learning Automata are two very efficient approaches that have been used to solve SAT. Benchmark experiments are conducted in order to disclose whether combining Multilevel with existing solutions to SAT will provide better results than the two mentioned approaches alone, mainly in terms of computational efficiency.
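To make the problem statement concrete, the following minimal sketch evaluates a formula in conjunctive normal form and searches for a satisfying assignment by exhaustive enumeration. It assumes a DIMACS-like encoding in which a positive integer denotes a variable and a negative integer its negation. The function names are hypothetical, and brute force is used only for illustration: it is not the Multilevel, Tabu Search, or Learning Automata approach studied in the thesis.

```python
from itertools import product

def clause_satisfied(clause, assignment):
    """A clause is a disjunction: true if at least one literal is true."""
    return any(assignment[abs(lit)] == (lit > 0) for lit in clause)

def formula_satisfied(clauses, assignment):
    """A CNF formula is a conjunction: true only if every clause is true."""
    return all(clause_satisfied(c, assignment) for c in clauses)

def brute_force_sat(clauses, num_vars):
    """Try all 2**num_vars assignments; return a model or None."""
    for values in product([False, True], repeat=num_vars):
        assignment = {i + 1: v for i, v in enumerate(values)}
        if formula_satisfied(clauses, assignment):
            return assignment
    return None

# (x1 OR NOT x2) AND (x2 OR x3) AND (NOT x1 OR NOT x3)
example_clauses = [[1, -2], [2, 3], [-1, -3]]
model = brute_force_sat(example_clauses, 3)
```

Exhaustive enumeration grows exponentially with the number of variables, which is why heuristic searches such as those named in the abstract are of interest.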

NORA - Norwegian Open Research Archives

Agder University Research Archive

Two-Stage Bagging Pruning for Reducing the Ensemble Size and Improving the Classification Performance

Author: Chen Bi
Jiang Bo
Shan Guogen
Song Yujie
Zhang Hua
Publication venue: Digital Scholarship@UNLV
Publication date: 01/01/2019

Ensemble methods, such as the traditional bagging algorithm, can usually improve the performance of a single classifier. However, they usually require large storage space as well as relatively time-consuming predictions. Many approaches have been developed to reduce the ensemble size and improve the classification performance by pruning the traditional bagging algorithms. In this article, we proposed a two-stage strategy to prune the traditional bagging algorithm by combining two simple approaches: accuracy-based pruning (AP) and distance-based pruning (DP). These two methods, as well as their two combinations, “AP+DP” and “DP+AP”, as the two-stage pruning strategy, were all examined. Compared with the single pruning methods, we found that the two-stage pruning methods can further reduce the ensemble size and improve the classification. The “AP+DP” method generally performs better than the “DP+AP” method when using four base classifiers: decision tree, Gaussian naive Bayes, K-nearest neighbor, and logistic regression. Moreover, as compared to the traditional bagging, the two-stage method “AP+DP” improved the classification accuracy by 0.88%, 4.06%, 1.26%, and 0.96%, respectively, averaged over 28 datasets under the four base classifiers. It was also observed that “AP+DP” outperformed three other existing algorithms, Brag, Nice, and TB, assessed on 8 common datasets. In summary, the proposed two-stage pruning methods are simple and promising approaches, which can both reduce the ensemble size and improve the classification accuracy.
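The two-stage idea can be sketched as follows. The abstract does not define AP or DP precisely, so this sketch makes two labeled assumptions: AP keeps the members with the highest accuracy on a validation set, and DP keeps the members whose predictions disagree least with the majority vote of the remaining ensemble. All function names and the toy predictions are hypothetical.

```python
from collections import Counter

def accuracy(preds, labels):
    return sum(p == y for p, y in zip(preds, labels)) / len(labels)

def majority_vote(member_preds):
    """Column-wise plurality vote over the members' predictions."""
    return [Counter(col).most_common(1)[0][0] for col in zip(*member_preds)]

def accuracy_prune(member_preds, labels, keep):
    # ASSUMPTION: AP = keep the `keep` most accurate members.
    ranked = sorted(member_preds, key=lambda p: accuracy(p, labels), reverse=True)
    return ranked[:keep]

def distance_prune(member_preds, keep):
    # ASSUMPTION: DP = keep the members closest (fewest disagreements)
    # to the majority vote of the current ensemble.
    vote = majority_vote(member_preds)
    def distance(p):
        return sum(a != b for a, b in zip(p, vote))
    return sorted(member_preds, key=distance)[:keep]

def two_stage_prune(member_preds, labels, keep_ap, keep_dp):
    """'AP+DP' ordering: accuracy-based pruning first, then distance-based."""
    return distance_prune(accuracy_prune(member_preds, labels, keep_ap), keep_dp)

# Validation-set predictions of five bagged members (toy data).
labels = [1, 0, 1, 1, 0, 0]
members = [
    [1, 0, 1, 1, 0, 0],
    [1, 0, 1, 0, 0, 0],
    [0, 1, 0, 0, 1, 1],
    [1, 1, 1, 1, 0, 1],
    [1, 0, 0, 1, 0, 0],
]
pruned = two_stage_prune(members, labels, keep_ap=3, keep_dp=2)
```

Swapping the two calls inside `two_stage_prune` would give the "DP+AP" ordering under the same assumptions.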

Directory of Open Access Journals

University of Nevada, Las Vegas Repository

A novel fault diagnosis for hydraulic pump based on EEMD-LTSA and PNN

Author: Chen Lu
Dengwei Song
Yujie Cheng
Publication venue: 'JVE International Ltd.'
Publication date: 08/12/2016

The hydraulic pump is the core part of the hydraulic system and directly impacts the system's performance; thus, fault diagnosis of the hydraulic pump is crucial. To realize fault diagnosis of the hydraulic pump, a method utilizing the vibration signal, which varies with the pump's performance, is proposed. First, ensemble empirical mode decomposition (EEMD) is used to decompose the original signal into a finite number of intrinsic mode functions (IMFs), and then the energy values are extracted to form the feature vector. Second, local tangent space alignment (LTSA), a manifold learning method, is applied for dimension reduction. Third, a probabilistic neural network (PNN) is employed as the classifier to recognize the fault pattern. Finally, the effectiveness of the proposed method is validated by experimental data with different faults.
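Two of the steps above, the IMF energy features and the PNN classifier, can be sketched in a few lines. The sketch assumes the IMFs have already been computed by some EEMD implementation and skips the LTSA step entirely. Normalizing the energies to sum to one, the kernel width `sigma`, the function names, and the toy feature vectors are assumptions for illustration, not details from the paper.

```python
import math

def imf_energies(imfs):
    """Energy of each IMF (sum of squared samples), normalized to sum to 1.

    ASSUMPTION: normalization is added for illustration and requires at
    least one non-zero sample.
    """
    energies = [sum(x * x for x in imf) for imf in imfs]
    total = sum(energies)
    return [e / total for e in energies]

def pnn_classify(x, train_x, train_y, sigma=0.1):
    """Probabilistic neural network: a Gaussian kernel per training pattern,
    averaged per class, followed by an argmax over classes."""
    sums, counts = {}, {}
    for xi, yi in zip(train_x, train_y):
        d2 = sum((a - b) ** 2 for a, b in zip(x, xi))
        sums[yi] = sums.get(yi, 0.0) + math.exp(-d2 / (2.0 * sigma ** 2))
        counts[yi] = counts.get(yi, 0) + 1
    return max(sums, key=lambda c: sums[c] / counts[c])

# Toy two-dimensional energy features for two hypothetical conditions.
train_x = [[0.9, 0.1], [0.8, 0.2], [0.2, 0.8], [0.1, 0.9]]
train_y = ["normal", "normal", "fault", "fault"]
predicted = pnn_classify([0.85, 0.15], train_x, train_y)
features = imf_energies([[1.0, 1.0], [1.0, 0.0]])
```

A PNN needs no iterative training: the stored training patterns are the network, and `sigma` controls how sharply each pattern influences nearby queries.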

Enhancing Low-Precision Sampling via Stochastic Gradient Hamiltonian Monte Carlo

Author: Chen Yujie
Song Qifan
Wang Ziyi
Zhang Ruqi
Publication date: 24/10/2023

Low-precision training has emerged as a promising low-cost technique to enhance the training efficiency of deep neural networks without sacrificing much accuracy. Its Bayesian counterpart can further provide uncertainty quantification and improved generalization accuracy. This paper investigates low-precision sampling via Stochastic Gradient Hamiltonian Monte Carlo (SGHMC) with low-precision and full-precision gradient accumulators for both strongly log-concave and non-log-concave distributions. Theoretically, our results show that, to achieve $\epsilon$-error in the 2-Wasserstein distance for non-log-concave distributions, low-precision SGHMC achieves quadratic improvement ($\widetilde{\mathbf{O}}\left(\epsilon^{-2}{\mu^*}^{-2}\log^2\left(\epsilon^{-1}\right)\right)$) compared to the state-of-the-art low-precision sampler, Stochastic Gradient Langevin Dynamics (SGLD) ($\widetilde{\mathbf{O}}\left(\epsilon^{-4}{\lambda^{*}}^{-1}\log^5\left(\epsilon^{-1}\right)\right)$). Moreover, we prove that low-precision SGHMC is more robust to the quantization error compared to low-precision SGLD due to the robustness of the momentum-based update w.r.t. gradient noise. Empirically, we conduct experiments on synthetic data and the MNIST, CIFAR-10, and CIFAR-100 datasets, which validate our theoretical findings. Our study highlights the potential of low-precision SGHMC as an efficient and accurate sampling method for large-scale and resource-limited machine learning.
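As an illustration of the general idea, the sketch below runs a one-dimensional SGHMC-style sampler whose position and momentum are stored on a fixed-point grid using unbiased stochastic rounding. It is a minimal sketch under stated assumptions, not the paper's algorithm: the particular discretization, the choice to quantize both variables after every update, the synthetic gradient noise standing in for minibatching, and all parameter values and function names are assumptions.

```python
import math
import random

def stochastic_round(x, delta, rng):
    """Round x to a multiple of delta; round up with probability equal to
    the fractional distance from the lower grid point (unbiased)."""
    lower = math.floor(x / delta) * delta
    p_up = (x - lower) / delta
    return lower + delta if rng.random() < p_up else lower

def low_precision_sghmc(grad, theta0, steps, h=0.05, gamma=1.0,
                        delta=1.0 / 64, grad_noise=0.5, seed=0):
    """ASSUMPTION: a simple discretization with friction gamma and step h;
    theta and v are re-quantized to the grid after every update."""
    rng = random.Random(seed)
    theta, v = theta0, 0.0
    samples = []
    for _ in range(steps):
        # Noisy gradient stands in for a minibatch estimate (assumption).
        g = grad(theta) + grad_noise * rng.gauss(0.0, 1.0)
        # Momentum update: friction, gradient step, injected Gaussian noise.
        v = (1.0 - h * gamma) * v - h * g \
            + math.sqrt(2.0 * gamma * h) * rng.gauss(0.0, 1.0)
        v = stochastic_round(v, delta, rng)
        # Position update using the new momentum, then quantization.
        theta = stochastic_round(theta + h * v, delta, rng)
        samples.append(theta)
    return samples

# Target: standard Gaussian, U(theta) = theta**2 / 2, so grad U = theta.
samples = low_precision_sghmc(lambda t: t, theta0=0.0, steps=20000)
mean = sum(samples) / len(samples)
var = sum((s - mean) ** 2 for s in samples) / len(samples)
```

For this target, the sample mean and variance should land near 0 and 1 respectively, up to discretization, quantization, and Monte Carlo error.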

arXiv.org e-Print Archive

Extrinsic Factors Affecting the Accuracy of Biomedical NER

Author: Li Zhiyi
Park Jungyeul
Song Yujie
Zhang Shengjie
Publication date: 29/05/2023

Biomedical named entity recognition (NER) is a critical task that aims to identify structured information in clinical text, which is often replete with complex, technical terms and a high degree of variability. Accurate and reliable NER can facilitate the extraction and analysis of important biomedical information, which can be used to improve downstream applications including the healthcare system. However, NER in the biomedical domain is challenging due to limited data availability, as annotating the data requires considerable expertise, time, and expense. In this paper, using the limited data, we explore various extrinsic factors, including the corpus annotation scheme, data augmentation techniques, semi-supervised learning, and Brill transformation, to improve the performance of an NER model on a clinical text dataset (i2b2 2012; Sun, Rumshisky, and Uzuner, 2013). Our experiments demonstrate that these approaches can significantly improve the model's F1 score from the original 73.74 to 77.55. Our findings suggest that considering different extrinsic factors and combining these techniques is a promising approach for improving NER performance in the biomedical domain, where the size of data is limited.

arXiv.org e-Print Archive

Breast Cancer Stem Cells

Author: Erwei Song
Fengyan Yu
Jieqiong Liu
Qiang Liu
Yujie Liu
Publication venue: 'IntechOpen'
Publication date: 14/12/2011

IntechOpen

Identification of Benzo[a]pyrene-metabolizing bacteria in forest soils by using DNA-based stable-isotope probing

Author: Jiang Longfei
Luo Chunling
Song Mengke
Wang Yujie
Zhang Dayi
Zhang Gan
Publication venue: 'American Society for Microbiology'
Publication date: 08/08/2015

DNA-based stable-isotope probing (DNA-SIP) was used in this study to investigate the uncultivated bacteria with benzo[a]pyrene (BaP) metabolism capacities in two Chinese forest soils (Mt. Maoer in Heilongjiang Province and Mt. Baicaowa in Hubei Province). We characterized three different phylotypes responsible for BaP degradation, none of which were previously reported as BaP-degrading microorganisms by SIP. In Mt. Maoer soil microcosms, the putative BaP degraders were classified as belonging to the genus Terrimonas (family Chitinophagaceae, order Sphingobacteriales), whereas Burkholderia spp. were the key BaP degraders in Mt. Baicaowa soils. The addition of metabolic salicylate significantly increased BaP degradation efficiency in Mt. Maoer soils, and the BaP-metabolizing bacteria shifted to the microorganisms in the family Oxalobacteraceae (genus unclassified). Meanwhile, salicylate addition did not change either BaP degradation or putative BaP degraders in Mt. Baicaowa. Polycyclic aromatic hydrocarbon ring-hydroxylating dioxygenase (PAH-RHD) genes were amplified, sequenced, and quantified in the DNA-SIP 13C heavy fraction to further confirm the BaP metabolism. By illuminating the microbial diversity and salicylate additive effects on BaP degradation across different soils, the results increased our understanding of BaP natural attenuation and provided a possible approach to enhance the bioremediation of BaP-contaminated soils.

PubMed Central

Lancaster E-Prints

Interfacing Nickel Nitride and Nickel Boosts Both Electrocatalytic Hydrogen Evolution and Oxidation Reactions

Author: Han Guanqun
Li Wei
Liao Peilin
Song Fuzhan
Sun Yujie
Yang Jiaqi
Publication venue: Hosted by Utah State University Libraries
Publication date: 01/10/2018

Electrocatalysts of the hydrogen evolution and oxidation reactions (HER and HOR) are of critical importance for the realization of a future hydrogen economy. In order to make electrocatalysts economically competitive for large-scale applications, increasing attention has been devoted to developing noble metal-free HER and HOR electrocatalysts, especially for alkaline electrolytes, due to the promise of emerging hydroxide exchange membrane fuel cells. Herein, we report that interface engineering of Ni3N and Ni results in a unique Ni3N/Ni electrocatalyst which exhibits exceptional HER/HOR activities in aqueous electrolytes. A systematic electrochemical study was carried out to investigate the superior hydrogen electrochemistry catalyzed by Ni3N/Ni, including nearly zero overpotential of catalytic onset, robust long-term durability, unity Faradaic efficiency, and excellent CO tolerance. Density functional theory computations were performed to aid the understanding of the electrochemical results and suggested that the real active sites are located at the interface between Ni3N and Ni.

Directory of Open Access Journals

DigitalCommons@USU